Critical Dimension in Data Mining

Authors

  • Divya Suryakumar
  • Andrew H. Sung
  • Qingzhong Liu
Abstract

Data mining is an increasingly important means of knowledge acquisition for many applications in diverse fields such as biology, medicine, management, and engineering. When a large-scale problem involves a multitude of potentially relevant factors but lacks a precise formulation or mathematical characterization that would allow a formal solution, the data collected for the application can often be mined to extract knowledge about the problem. Feature ranking and selection are therefore immediate issues to consider when preparing to perform data mining, and the literature contains numerous theoretical and empirical methods of feature selection for a variety of problems. This work-in-progress paper concerns the related question of critical dimension: for a specific data mining task, does there exist a minimum number of features that a specific learning machine requires to achieve satisfactory performance? As a first step in addressing this question, a simple ad hoc method is employed for experimentation, and it is shown that the phenomenon of critical dimension indeed exists for several of the datasets studied. The implication is that each of these datasets contains irrelevant features or input attributes, which can be eliminated to achieve higher accuracy in model building using learning machines.

Keywords: feature selection; critical dimension; machine learning.
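The abstract does not spell out the ad hoc method, so the following is only a minimal sketch of one possible reading of "critical dimension": given features already ranked from most to least useful, find the smallest number k such that the top-k features, and every larger prefix of the ranking, reach a chosen performance threshold. The function name `critical_dimension`, the `evaluate` callback, the threshold, and the toy accuracy curve are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Optional, Sequence


def critical_dimension(
    ranked_features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
    threshold: float,
) -> Optional[int]:
    """Return the smallest k such that the top-k ranked features, and every
    larger prefix of the ranking, score at least `threshold`.

    Returns None if the full feature set itself falls short (or is empty).
    """
    n = len(ranked_features)
    # Score every prefix of the ranking: top-1, top-2, ..., top-n.
    scores = [evaluate(ranked_features[:k]) for k in range(1, n + 1)]

    critical = None
    # Walk down from the full set and stop at the first prefix that fails.
    for k in range(n, 0, -1):
        if scores[k - 1] >= threshold:
            critical = k
        else:
            break
    return critical


# Hypothetical accuracy curve, for illustration only (not data from the paper).
toy_accuracy = {1: 0.61, 2: 0.70, 3: 0.93, 4: 0.94, 5: 0.94}
features = ["f1", "f2", "f3", "f4", "f5"]  # assumed already ranked, best first

result = critical_dimension(
    features, lambda subset: toy_accuracy[len(subset)], threshold=0.90
)
print(result)  # prints 3
```

In practice `evaluate` would train and validate the chosen learning machine on the given feature subset; both the ranking method and the definition of "satisfactory" are choices left to the experimenter.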

Similar resources

Ranking sawability of dimension stone using PROMETHEE method

Predicting the sawability of the dimension stone is one of the most important factors involved in production planning. Moreover, this factor can be used as an important criterion in the cost estimation and planning of the stone plants. The main purpose for carrying out this work was to rank the sawability of the dimension stone using the PROMETHEE method. In this research work, four important p...

Estimation of the Ampere Consumption of Dimension Stone Sawing Machine Using of Artificial Neural Networks

Nowadays, estimating the ampere consumption and achieving the optimum condition from the perspective of energy consumption is one of the most important steps in reducing production costs. This research attempts to develop an accurate model for estimating the ampere consumption using artificial neural networks (ANN). In the first step, experimental studies were carried out on 7 ca...

Credit Card Fraud Detection using Data mining and Statistical Methods

Due to today’s advancement in technology and businesses, fraud detection has become a critical component of financial transactions. Given the vast amounts of data in large datasets, it becomes more difficult to detect fraudulent transactions manually. In this research, we propose a combined method using both data mining and statistical tasks, utilizing feature selection, resampling and cost-...

Fault Mode Analyze of Power System Based on Data Mining

Monitoring operating status and analyzing and assessing operating performance are important components of ensuring the safe operation of a power system. This paper presents a new type of data mining based on fault mode analysis and a fast diagnostic reasoning algorithm. Fault appearances are collected and cleaned up in a fault information dimension table, the relationship rule dimension t...

Tire demand planning based on reliability and operating environment

Tires represent a critical spare part in mines. There is a shortage of medium and large tires. In addition, with increased mining activities and the creation of new mines, the demand for tires has increased significantly. Thus, it is particularly important for mining engineers to identify tire characteristics and correctly manage the spare part inventory. Spare parts management is critical from...

Publication date: 2012